adding CombineInputFileFormat; only single use case so far #3
thomasstorm wants to merge 2 commits into master
Conversation
    File inputFile = getInputFile();
    if (inputFile == null) {
        for (Path inputPath : inputPaths) {
            inputFile = CalvalusProductIO.copyFileToLocal(inputPath, getConfiguration());
            setInputFile(inputFile);
            if (inputFile == null) {
                setInputFile(inputFile);
            }
        }
    }
Here, it seems the logic has unintentionally changed. Previously, the null test was located before the second copyFileToLocal assignment; with this change it will never be true.
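For illustration, a minimal sketch of the ordering difference being described; the names mirror the snippet above, but this is not the original Calvalus code, and the behaviour of copyFileToLocal is only assumed:

    // Earlier ordering as described: test first, copy only when the input file is missing.
    File inputFile = getInputFile();
    if (inputFile == null) {
        inputFile = CalvalusProductIO.copyFileToLocal(inputPath, getConfiguration());
        setInputFile(inputFile);
    }

    // Ordering in the new snippet: the copy and setInputFile have already run, so the
    // null test that follows can only re-set the same value and (assuming
    // copyFileToLocal returns a non-null File) is never true.
    inputFile = CalvalusProductIO.copyFileToLocal(inputPath, getConfiguration());
    setInputFile(inputFile);
    if (inputFile == null) {
        setInputFile(inputFile);
    }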
    /**
     * @author thomas
     */
    public class CombineFileInputFormat extends InputFormat {
Is there a relation to org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat?
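For context only (not Calvalus code): Hadoop's built-in class of that name is normally subclassed to pack many small input files into CombineFileSplits, roughly as in the sketch below. WholeFileRecordReader is a hypothetical per-file reader, and the 128 MB cap is an arbitrary example value.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader;
    import org.apache.hadoop.mapreduce.lib.input.CombineFileSplit;

    /** Packs many small files into combined splits; contrast with creating one split per pattern. */
    public class SmallFilesInputFormat extends CombineFileInputFormat<LongWritable, Text> {

        public SmallFilesInputFormat() {
            setMaxSplitSize(128L * 1024 * 1024);  // each combined split is capped at ~128 MB
        }

        @Override
        public RecordReader<LongWritable, Text> createRecordReader(InputSplit split, TaskAttemptContext context)
                throws IOException {
            // WholeFileRecordReader (hypothetical) reads one of the packed files per invocation.
            return new CombineFileRecordReader<>((CombineFileSplit) split, context, WholeFileRecordReader.class);
        }
    }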
     * Creates a single split from a given pattern
     */
    @Override
    public List<InputSplit> getSplits(JobContext context) throws IOException {
What about our other methods for determining inputs, in particular those using the geo-inventory? I know that PatternBasedInputFormat needs refactoring and decomposition, but I think the other ways of determining inputs are still required.
Thinking about how to refactor PatternBasedInputFormat, it may be good to distinguish the way the inputs are determined (geo-inventory, opensearch query, path pattern, ...) by different classes, as they have different parameters anyway; the client could then automatically select the right class depending on which parameter is specified. From each of these we could either derive a class for CombineFileSplit generation, or make it a parameter. In any case, the old PatternBasedInputFormat could delegate the getSplits() call to the new implementations to keep backwards compatibility.
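A minimal sketch of that delegation idea, with purely hypothetical names (InputSelection, selectStrategy, the calvalus.input.combine flag); none of this is existing Calvalus code:

    import java.io.IOException;
    import java.util.List;
    import org.apache.hadoop.mapreduce.InputFormat;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.JobContext;

    /** One strategy per way of determining inputs: geo-inventory, opensearch query, path pattern, ... */
    interface InputSelection {
        List<InputSplit> createSplits(JobContext context, boolean combineIntoSingleSplit) throws IOException;
    }

    /** The existing entry point keeps its name and delegates, preserving backwards compatibility. */
    public abstract class PatternBasedInputFormat extends InputFormat {

        @Override
        public List<InputSplit> getSplits(JobContext context) throws IOException {
            // Pick the strategy from whichever request parameter is present.
            InputSelection selection = selectStrategy(context);
            // Hypothetical flag deciding whether to emit one CombineFileSplit or many splits.
            boolean combine = context.getConfiguration().getBoolean("calvalus.input.combine", false);
            return selection.createSplits(context, combine);
        }

        /** Returns e.g. a GeoInventoryInputSelection, OpenSearchInputSelection, or PathPatternInputSelection. */
        protected abstract InputSelection selectStrategy(JobContext context);
    }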